Smart Paper Notes

Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing

Benedikt Boecking, Naoto Usuyama, Shruthi Bannur, Daniel C. Castro, Anton Schwaighofer, Stephanie Hyland, Maria Wetscherek, Tristan Naumann, Aditya Nori, Javier Alvarez-Valle

Categories: Computer Vision | Natural Language Processing

2022-04-21

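Multimodal data abounds in biomedicine, for example radiology images and their reports. Interpreting this data at scale is essential for improving clinical care and accelerating clinical research. Compared with the general domain, biomedical text with its complex semantics poses additional challenges for vision-language modelling, and previous work has used insufficiently adapted models that lack domain-specific language understanding. In this paper, we show that principled textual semantic modelling can substantially improve contrastive learning in self-supervised vision-language processing. We release a language model that achieves state-of-the-art results in radiology natural language inference through its improved vocabulary and a novel language pretraining objective that leverages semantics and discourse characteristics in radiology reports. Further, we propose a self-supervised joint vision-language approach with a focus on better text modelling. It establishes new state-of-the-art results on a wide range of publicly available benchmarks, in part by leveraging our new domain-specific language model. We release a new dataset with locally aligned phrase grounding annotations by radiologists to facilitate the study of complex semantic modelling in biomedical vision-language processing. A broad evaluation, including on this new dataset, shows that our contrastive learning approach, aided by textual-semantic modelling, outperforms prior methods in segmentation tasks, despite using only a global-alignment objective.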
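A global-alignment contrastive objective of the kind the abstract mentions is commonly written as a symmetric InfoNCE-style loss over paired image and text embeddings. The following is a minimal sketch under that assumption; the function names, the toy two-dimensional embeddings, and the temperature of 0.5 are illustrative and are not taken from the paper.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce(image_embs, text_embs, temperature=0.5):
    """Symmetric InfoNCE loss; item k of each list is a matched image/report pair."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # Cosine similarity of every image with every text, divided by the temperature.
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]

    def cross_entropy(rows):
        # For row k, the correct "class" is column k (its own pair).
        total = 0.0
        for k, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[k]
        return total / n

    columns = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (cross_entropy(logits) + cross_entropy(columns))

# Hypothetical toy embeddings: correctly paired versus deliberately mismatched.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

The loss is lower when each image is most similar to its own report, which is the only supervision signal such an objective uses; no region-level annotation enters the computation.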

Fine-Tuning Large Neural Language Models for Biomedical Natural Language Processing

Robert Tinn, Hao Cheng, Yu Gu, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon

Categories: Natural Language Processing | Machine Learning

2021-12-15

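Motivation: A perennial challenge for biomedical researchers and clinical practitioners is keeping up with the rapid growth of publications and medical notes. Natural language processing (NLP) has emerged as a promising direction for taming information overload. In particular, large neural language models facilitate transfer learning by pretraining on unlabeled text, as exemplified by the success of BERT models in various NLP applications. However, fine-tuning such models for an end task remains challenging, especially with small labeled datasets, which are common in biomedical NLP. Results: We conduct a systematic study of fine-tuning stability in biomedical NLP. We show that fine-tuning performance may be sensitive to pretraining settings, especially in low-resource domains. Large models have the potential to attain better performance, but increasing model size also exacerbates fine-tuning instability. We therefore conduct a comprehensive exploration of techniques for addressing fine-tuning instability. We show that these techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications. Specifically, freezing lower layers helps standard BERT-BASE models, while layer-wise decay is more effective for BERT-LARGE and ELECTRA models. For low-resource text similarity tasks such as BIOSSES, reinitializing the top layer is the optimal strategy. Overall, domain-specific vocabulary and pretraining facilitate more robust models for fine-tuning. Based on these findings, we establish a new state of the art on a wide range of biomedical NLP applications. Availability and implementation: To facilitate progress in biomedical NLP, we release our state-of-the-art pretrained and fine-tuned models: https://aka.ms/blurb.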
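Two of the stabilization techniques named in the abstract, freezing lower layers and decaying the learning rate across layers, can be expressed as a per-layer learning-rate schedule. The sketch below is an illustration only: the function names, the base rate of 1e-4, the decay factor of 0.9, and the choice of 12 layers with 4 frozen are assumptions, not values reported by the authors.

```python
def layerwise_learning_rates(num_layers, base_lr=1e-4, decay=0.9):
    """Return one learning rate per layer, index 0 being the lowest layer.

    The top layer trains at base_lr; each layer below it is scaled by `decay` once more.
    """
    return [base_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]

def freeze_lower_layers(rates, num_frozen):
    """Freezing as a limiting case: the lowest `num_frozen` layers get a rate of 0."""
    return [0.0 if i < num_frozen else lr for i, lr in enumerate(rates)]

# Hypothetical 12-layer encoder.
rates = layerwise_learning_rates(num_layers=12)
frozen = freeze_lower_layers(rates, num_frozen=4)
print(rates[0] < rates[-1], frozen[:4])
```

In a real training setup these rates would be passed to the optimizer as per-layer parameter groups; both variants limit how far the lower, more general layers can drift on a small labeled dataset.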

E-commerce users' preferences for delivery options

Yuki Oyama, Daisuke Fukuda, Naoto Imura, Katsuhiro Nishinari

Categories: Machine Learning

2022-12-30

Many e-commerce marketplaces offer their users fast delivery options for free to meet the increasing needs of users, imposing an excessive burden on city logistics. Therefore, understanding e-commerce users' preference for delivery options is a key to designing logistics policies. To this end, this study designs a stated choice survey in which respondents are faced with choice tasks among different delivery options and time slots, which was completed by 4,062 users from the three major metropolitan areas in Japan. To analyze the data, mixed logit models capturing taste heterogeneity as well as flexible substitution patterns have been estimated. The model estimation results indicate that delivery attributes including fee, time, and time slot size are significant determinants of the delivery option choices. Associations between users' preferences and socio-demographic characteristics, such as age, gender, teleworking frequency and the presence of a delivery box, were also suggested. Moreover, we analyzed two willingness-to-pay measures for delivery, namely, the value of delivery time savings (VODT) and the value of time slot shortening (VOTS), and applied a non-semiparametric approach to estimate their distributions in a data-oriented manner. Although VODT has a large heterogeneity among respondents, the estimated median VODT is 25.6 JPY/day, implying that more than half of the respondents would wait an additional day if the delivery fee were increased by only 26 JPY, that is, they do not necessarily need a fast delivery option but often request it when cheap or almost free. Moreover, VOTS was found to be low, distributed with the median of 5.0 JPY/hour; that is, users do not highly value the reduction in time slot size in monetary terms. These findings on e-commerce users' preferences can help in designing levels of service for last-mile delivery to significantly improve its efficiency.

Generative Colorization of Structured Mobile Web Pages

Kotaro Kikuchi, Naoto Inoue, Mayu Otani, Edgar Simo-Serra, Kota Yamaguchi

Categories: Computer Vision

2022-12-22

Color is a critical design factor for web pages, affecting important factors such as viewer emotions and the overall trust and satisfaction of a website. Effective coloring requires design knowledge and expertise, but if this process could be automated through data-driven modeling, efficient exploration and alternative workflows would be possible. However, this direction remains underexplored due to the lack of a formalization of the web page colorization problem, datasets, and evaluation protocols. In this work, we propose a new dataset consisting of e-commerce mobile web pages in a tractable format, which are created by simplifying the pages and extracting canonical color styles with a common web browser. The web page colorization problem is then formalized as a task of estimating plausible color styles for a given web page content with a given hierarchical structure of the elements. We present several Transformer-based methods that are adapted to this task by prepending structural message passing to capture hierarchical relationships between elements. Experimental results, including a quantitative evaluation designed for this task, demonstrate the advantages of our methods over statistical and image colorization methods. The code is available at https://github.com/CyberAgentAILab/webcolor.

Local Differential Privacy Image Generation Using Flow-based Deep Generative Models

Hisaichi Shibata, Shouhei Hanaoka, Yang Cao, Masatoshi Yoshikawa, Tomomi Takenaga, Yukihiro Nomura, Naoto Hayashi, Osamu Abe

Categories: Computer Vision

2022-12-20

Diagnostic radiologists need artificial intelligence (AI) for medical imaging, but access to medical images required for training in AI has become increasingly restrictive. To release and use medical images, we need an algorithm that can simultaneously protect privacy and preserve pathologies in medical images. To develop such an algorithm, here, we propose DP-GLOW, a hybrid of a local differential privacy (LDP) algorithm and one of the flow-based deep generative models (GLOW). By applying a GLOW model, we disentangle the pixelwise correlation of images, which makes it difficult to protect privacy with straightforward LDP algorithms for images. Specifically, we map images onto the latent vector of the GLOW model, each element of which follows an independent normal distribution, and we apply the Laplace mechanism to the latent vector. Moreover, we applied DP-GLOW to chest X-ray images to generate LDP images while preserving pathologies.
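The step of applying the Laplace mechanism to a latent vector can be sketched in isolation. This is a minimal illustration, not the DP-GLOW implementation: the GLOW encoder and decoder are omitted, and the function names, the clipping bound `clip`, and the sensitivity bound (2 × clip × dimension, the largest L1 distance between two clipped vectors) are assumptions introduced here.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def privatize_latent(latent, epsilon, clip=1.0, seed=None):
    """Clip each latent element to [-clip, clip], then add Laplace noise.

    Assumption: clipping bounds the L1 distance between any two latent
    vectors by 2 * clip * len(latent), which is used as the sensitivity.
    """
    rng = random.Random(seed)
    clipped = [max(-clip, min(clip, z)) for z in latent]
    sensitivity = 2.0 * clip * len(clipped)
    scale = sensitivity / epsilon  # smaller epsilon means more noise
    return [z + laplace_noise(scale, rng) for z in clipped]

# Hypothetical 4-dimensional latent vector under two privacy budgets.
strong_privacy = privatize_latent([0.0, 0.0, 0.0, 0.0], epsilon=1.0, seed=0)
weak_privacy = privatize_latent([0.0, 0.0, 0.0, 0.0], epsilon=10.0, seed=0)
print(len(strong_privacy), len(weak_privacy))
```

In the full pipeline the noisy latent vector would be decoded back into an image by the generative model. Because the noise scale grows with the latent dimension under this sensitivity bound, the privacy budget and the clipping range need careful tuning for high-dimensional latents.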

EOD: The IEEE GRSS Earth Observation Database

Michael Schmitt, Pedram Ghamisi, Naoto Yokoya, Ronny Hänsch

Categories: Computer Vision

2022-09-26

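In the era of deep learning, annotated datasets have become a crucial asset for the remote sensing community. Over the last decade, a plethora of datasets has been published, each designed for a specific data type and with a specific task or application in mind. In the jungle of remote sensing datasets, it can be hard to keep track of what is already available. In this paper, we introduce EOD, the IEEE GRSS Earth Observation Database, an interactive online platform for cataloguing different types of datasets that leverage remote sensing imagery.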

Aging prediction using deep generative model toward the development of preventive medicine

Hisaichi Shibata, Shouhei Hanaoka, Yukihiro Nomura, Naoto Hayashi, Osamu Abe

Categories: Computer Vision

2022-08-23

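From birth to death, we all experience surprisingly ubiquitous changes over time due to aging. If we could predict aging in the digital domain, that is, with a digital twin of the human body, we would be able to detect lesions at a very early stage, thereby improving quality of life and extending life span. We observe that none of the previously developed digital twins of the adult human body explicitly trained longitudinal conversion rules between volumetric medical images with deep generative models, which may lead to poor prediction performance for, for example, ventricular volumes. Here, we establish a new digital twin of the adult human body that is trained on longitudinally acquired head computed tomography (CT) images and predicts a future volumetric head CT image from a single present volumetric head CT image. For the first time, we adopt one of the three-dimensional flow-based deep generative models to realize this sequential three-dimensional digital twin. We show that our digital twin outperforms the latest methods for predicting ventricular volumes over relatively short terms.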

Learning Mutual Modulation for Self-Supervised Cross-Modal Super-Resolution

Xiaoyu Dong, Naoto Yokoya, Longguang Wang, Tatsumi Uezato

Categories: Computer Vision

2022-07-19

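Self-supervised cross-modal super-resolution (SR) can overcome the difficulty of acquiring paired training data, but it is challenging because only a low-resolution (LR) source image and a high-resolution (HR) guide image are available. Existing methods rely on pseudo or weak supervision in LR space and therefore deliver results that are blurry or not faithful to the source modality. To address this issue, we present a mutual modulation SR (MMSR) model, which tackles the task with a mutual modulation strategy comprising a source-to-guide modulation and a guide-to-source modulation. In these modulations, we develop cross-domain adaptive filters to fully exploit cross-modal spatial dependency; they help induce the source to emulate the resolution of the guide and induce the guide to mimic the modality characteristics of the source. Moreover, we adopt a cycle consistency constraint to train MMSR in a fully self-supervised manner. Experiments on various tasks demonstrate the state-of-the-art performance of our MMSR.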

Stable Long-Term Recurrent Video Super-Resolution

Benjamin Naoto Chiche, Arnaud Woiselle, Joana Frontera-Pons, Jean-Luc Starck

Categories: Computer Vision | Machine Learning

2021-12-16

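Recurrent models have gained popularity in deep learning (DL) based video super-resolution (VSR) because of their greater computational efficiency, temporal receptive field, and temporal consistency compared with sliding-window based models. However, when inferring on long video sequences with low motion (that is, sequences in which some parts of the scene barely move), recurrent models diverge through recurrent processing and generate high-frequency artifacts. To the best of our knowledge, no study of VSR has pointed out this instability problem, which can be critical for some real-world applications. Video surveillance is a typical example where such artifacts would occur, since both the camera and the scene stay static for a long time. In this work, we expose the instabilities of existing recurrent VSR networks on long sequences with low motion. We demonstrate them on a new long-sequence dataset that we created, the Quasi-Static Video Set. Finally, we introduce a new framework for recurrent VSR networks that is both stable and competitive, based on Lipschitz stability theory. Based on this framework, we propose a new recurrent VSR network, coined Middle Recurrent Video Super-Resolution (MRVSR). We empirically show its competitive performance on long sequences with low motion.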

Computational Complexity of Normalizing Constants for the Product of Determinantal Point Processes

Naoto Ohsaka, Tatsuya Matsuoka

Categories: Machine Learning

2021-11-28

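We consider the product of determinantal point processes (DPPs), a point process whose probability mass is proportional to the product of principal minors of multiple matrices, as a natural and promising generalization of DPPs. We study the computational complexity of computing its normalizing constant, which is among the most essential probabilistic inference tasks. Our complexity-theoretic results (almost) rule out the existence of efficient algorithms for this task unless the input matrices are forced to have favorable structures. In particular, we prove the following: (1) Computing $\sum_S \det({\bf A}_{S,S})^p$ exactly is UP-hard and Mod$_3$P-hard for every (fixed) positive even integer $p$, which gives a negative answer to an open question posed by Kulesza and Taskar. (2) $\sum_S \det({\bf A}_{S,S}) \det({\bf B}_{S,S}) \det({\bf C}_{S,S})$ is NP-hard to approximate within a factor of $2^{O(|I|^{1-\epsilon})}$ or $2^{O(n^{1/\epsilon})}$ for any $\epsilon > 0$, where $|I|$ is the input size and $n$ is the order of the input matrix. This result is stronger than the #P-hardness for the case of two matrices derived by Gillenwater. (3) There exists a $k^{O(k)} n^{O(1)}$-time algorithm for computing $\sum_S \det({\bf A}_{S,S}) \det({\bf B}_{S,S})$, where $k$ is the maximum rank of $\bf A$ and $\bf B$, or the treewidth of the graph formed by the nonzero entries of $\bf A$ and $\bf B$. Such parameterized algorithms are said to be fixed-parameter tractable. These results can be extended to the fixed-size case. Further, we present two applications of fixed-parameter tractable algorithms given a matrix $\bf A$ of treewidth $w$: (4) We can compute a $2^{\frac{n}{2p-1}}$-approximation to $\sum_S \det({\bf A}_{S,S})^p$ for any fractional number $p > 1$ in $w^{O(wp)} n^{O(1)}$ time. (5) We can, in $w^{O(w\sqrt{n})} n^{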
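To make the quantity under study concrete, the normalizing constant can be computed by brute force for tiny matrices. The sketch below enumerates all $2^n$ subsets, so it only illustrates the definition and says nothing about the algorithms in the paper; the function names and the 2×2 example matrix are assumptions introduced here. For a single matrix with $p = 1$ the sum has the well-known closed form $\det({\bf I} + {\bf A})$, which the example checks.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion; adequate for the tiny matrices used here."""
    if not m:
        return 1.0  # determinant of the empty matrix, used for the empty subset
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def principal_minor(a, subset):
    """Determinant of the submatrix of `a` indexed by `subset` on rows and columns."""
    return det([[a[i][j] for j in subset] for i in subset])

def normalizing_constant(matrices, p=1):
    """Sum over all subsets S of the product over matrices M of det(M[S,S])^p."""
    n = len(matrices[0])
    total = 0.0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            term = 1.0
            for m in matrices:
                term *= principal_minor(m, subset) ** p
            total += term
    return total

# Hypothetical 2x2 example matrix.
A = [[2.0, 1.0], [1.0, 2.0]]
identity_plus_A = [[3.0, 1.0], [1.0, 3.0]]
print(normalizing_constant([A]) == det(identity_plus_A))
print(normalizing_constant([A], p=2), normalizing_constant([A, A]))
```

Raising the minors to the power $p = 2$ gives the same value as taking the product of two DPPs with identical kernels, which is why exponentiated DPPs are a special case of the product setting.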
